|
The Lesk algorithm is a classical algorithm for word sense disambiguation introduced by Michael E. Lesk in 1986.〔Lesk, M. (1986). Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In SIGDOC '86: Proceedings of the 5th annual international conference on Systems documentation, pages 24-26, New York, NY, USA. ACM.〕

==Overview==
The Lesk algorithm is based on the assumption that words in a given "neighborhood" (section of text) will tend to share a common topic. A simplified version of the Lesk algorithm compares the dictionary definition of each sense of an ambiguous word with the terms contained in its neighborhood. Versions have been adapted to use WordNet.〔Satanjeev Banerjee and Ted Pedersen. ''An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet'', Lecture Notes in Computer Science, Vol. 2276, pages 136-145, 2002. ISBN 3-540-43219-1〕

An implementation might look like this:
# For every sense of the word being disambiguated, count the number of words that occur both in the neighborhood of that word and in the dictionary definition of that sense.
# Choose the sense with the highest count.

A frequently used example illustrating this algorithm is the context "pine cone". The following dictionary definitions are used:

PINE
1. kinds of evergreen tree with needle-shaped leaves
2. waste away through sorrow or illness

CONE
1. solid body which narrows to a point
2. something of this shape whether solid or hollow
3. fruit of certain evergreen trees

As can be seen, the best intersection is Pine #1 ⋂ Cone #3 = 2: the two definitions share the words "evergreen" and "tree" (matching "trees"), while every other pair of senses shares no content words.

Source: Wikipedia, the free encyclopedia.
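
The overlap counting in the "pine cone" example can be sketched in a few lines of Python. This is only an illustrative sketch, not Lesk's original implementation: the names `DEFINITIONS`, `STOPWORDS`, `normalize`, `content_words` and `best_sense_pair` are hypothetical, the stopword list is hand-picked for this example, and the plural stripping is a crude stand-in for a real stemmer, assumed here so that "trees" matches "tree".

```python
from itertools import product

# Dictionary definitions copied from the "pine cone" example.
DEFINITIONS = {
    "pine": [
        "kinds of evergreen tree with needle-shaped leaves",
        "waste away through sorrow or illness",
    ],
    "cone": [
        "solid body which narrows to a point",
        "something of this shape whether solid or hollow",
        "fruit of certain evergreen trees",
    ],
}

# Assumption: a tiny hand-picked stopword list, sufficient for this example only.
STOPWORDS = {"a", "of", "or", "this", "through", "to", "whether", "which", "with"}


def normalize(word):
    """Lowercase a word and crudely strip a plural 's' (not a real stemmer)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word


def content_words(definition):
    """Return the set of normalized non-stopword tokens in a definition."""
    return {normalize(w) for w in definition.split() if w.lower() not in STOPWORDS}


def best_sense_pair(word_a, word_b, definitions=DEFINITIONS):
    """Return (sense of word_a, sense of word_b, shared words) with the largest overlap.

    Sense numbers are 1-based, as in the example. If no pair of definitions
    shares a word, the result is (None, None, set()). Ties keep the first pair.
    """
    best = (None, None, set())
    senses_a = enumerate(definitions[word_a], start=1)
    senses_b = enumerate(definitions[word_b], start=1)
    for (i, def_a), (j, def_b) in product(senses_a, senses_b):
        shared = content_words(def_a) & content_words(def_b)
        if len(shared) > len(best[2]):
            best = (i, j, shared)
    return best
```

Calling `best_sense_pair("pine", "cone")` returns sense 1 of "pine" and sense 3 of "cone" with the shared words "evergreen" and "tree", which agrees with the intersection of size 2 in the example. With a larger dictionary, the stopword list and the normalization step would need to be replaced by something more robust.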
|